Abstract. Considering the high heterogeneity of the ontologies published
on the web, ontology matching is a crucial issue whose aim is to
establish links between an entity of a source ontology and one or several
entities from a target ontology. Perfectible similarity measures, considered
as sources of information, are combined to establish these links. The
theory of belief functions is a powerful mathematical tool for combining
such uncertain information. In this paper, we introduce a decision process
based on a distance measure to identify the best possible matching
entities for a given source entity.
1 Introduction

This paper proposes a decision rule based on a distance measure. This rule
calculates the distance between a combined mass function and a categorical mass
function and keeps the hypotheses with the lowest distance. We propose this rule
for its ability to decide on composite hypotheses as well as its suitability
for our application domain, namely the semantic web and particularly
ontology matching, where decision making is an important step.

Ontology matching is the process of finding for each entity of a source
ontology $O_1$ its corresponding entity in a target ontology $O_2$. This process
can focus on finding simple mappings (1:1) or complex mappings (1:n or n:1).
The first consists in matching only one entity of $O_1$ with only one entity of
$O_2$, whereas the second consists in finding either for one entity of $O_1$ its
multiple correspondences of entities in $O_2$ or matching multiple entities of
$O_1$ with only one entity of $O_2$. In this paper, we are interested in finding
simple mappings as well as complex ones of the form (1:n).

The matching process is performed through the application of matching
techniques which are mainly based on the use of similarity measures. Since no
similarity measure applied individually is able to give a perfect alignment,
exploiting the complementarity of different similarity measures can yield a
better alignment. Combining these similarity measures may raise conflicts
between the different results, which should be modeled and resolved.

We suggest using the theory of belief functions [?] as a tool for modeling
ontology matching and especially for combining the results of the different
similarity measures. Because we are working in an uncertain setting and are
interested in finding complex matchings, which can be viewed as finding
composite hypotheses formed from entities of two ontologies, we suggest applying
our proposed decision rule to the combined information and choosing, for each
entity of the source ontology, the entities of the target ontology with the
lowest distance.

The remainder of this paper is organized as follows: Section 2 defines the
ontology matching process. In section 3, we recall the basic
concepts underlying the theory of belief functions. In section 4, we present our
decision rule based on a distance measure. Section 5 is devoted to the description
of the credibilistic decision process for matching ontologies as well as the
application of our proposed decision rule. Section 6 gives an overview of some
ontology matching approaches dealing with uncertainty. Finally, we conclude in
section 7 and present future work.


2 Ontology Matching 

The open nature of the semantic web [5] tends to encourage the development, for
a domain of interest, of heterogeneous ontologies which differ from each other at
the terminological level and/or the representational one. In order to mitigate the
effect of semantic heterogeneity and to assure interoperability between
applications that make use of these ontologies, a key challenge is to define an
efficient and reliable matching between ontologies [7].

Formally, ontology matching is defined as a function $A = f(O_1, O_2, A', p, r)$.
From a pair of ontologies to match $O_1$ and $O_2$, an input alignment $A'$, a
set of parameters $p$, and a set of oracles and resources $r$, the function $f$
returns an alignment $A$ between these ontologies. Parameters and resources
refer to thresholds and external resources respectively.

With the new vision of the web that tends to make applications understandable
by machines, automatic and semi-automatic discovery of correspondences
between ontologies is required. The reader may refer to [7] for an exhaustive
state of the art of ontology matching techniques.

3 The Theory of Belief Functions 

3.1 Definitions 

The frame of discernment $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ is
a finite non-empty set of $n$ elementary, mutually exclusive and exhaustive
hypotheses related to a given problem. The power set of $\Theta$, denoted by
$2^\Theta$, is defined as the set of singleton hypotheses of $\Theta$, all
possible disjunctions of these hypotheses, as well as the empty set.

The basic belief assignment (bba) is the mapping from elements of the power
set $2^\Theta$ onto [0,1] that satisfies:

$$ m(\emptyset) = 0, \qquad \sum_{A \subseteq \Theta} m(A) = 1. \tag{1} $$


The value $m(A)$ quantifies the part of belief exactly committed to the subset
$A$ of $\Theta$.

A focal element $A$ is an element of $2^\Theta$ such that $m(A) > 0$.

From a given bba, the corresponding credibility and plausibility functions
are respectively defined as:

$$ bel(A) = \sum_{B \subseteq A,\, B \neq \emptyset} m(B) \tag{2} $$

and

$$ pl(A) = \sum_{A \cap B \neq \emptyset} m(B). \tag{3} $$

The value $bel(A)$ expresses the total belief that one allocates to $A$, whereas
$pl(A)$ quantifies the maximum amount of belief that might support a subset
$A$ of $\Theta$.

Some special bbas are defined in the theory of belief functions. Among them is
the categorical bba, a bba with a unique focal element $X$ different from the
frame of discernment $\Theta$ and the empty set $\emptyset$, which is defined as
$m_X(X) = 1$.
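
The definitions of this subsection can be sketched in a few lines of Python. This is only an illustrative sketch: representing a bba as a dictionary from `frozenset` to mass, the function names, and the three-element frame with labels `t1`, `t2`, `t3` are assumptions made here, not notation from the paper.

```python
from itertools import combinations

def powerset(frame):
    """Return every subset of `frame`, the empty set included, as frozensets."""
    items = sorted(frame)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def is_bba(m, frame, tol=1e-9):
    """Check Eq. (1): no mass on the empty set, focal sets inside the frame, total mass 1."""
    return (m.get(frozenset(), 0.0) == 0.0
            and all(A <= frame for A in m)
            and abs(sum(m.values()) - 1.0) < tol)

def bel(m, A):
    """Credibility: total mass of the non-empty focal sets included in A."""
    return sum(v for B, v in m.items() if B and B <= A)

def pl(m, A):
    """Plausibility: total mass of the focal sets that intersect A."""
    return sum(v for B, v in m.items() if B & A)

# Hypothetical frame and bba, chosen only for illustration.
THETA = frozenset({"t1", "t2", "t3"})
m = {frozenset({"t1"}): 0.5, frozenset({"t2", "t3"}): 0.3, THETA: 0.2}
A = frozenset({"t1", "t2"})

print(is_bba(m, THETA))       # validity check of Eq. (1)
print(bel(m, A), pl(m, A))    # belief and plausibility of t1 U t2
```

For any subset the credibility never exceeds the plausibility, which the sketch makes easy to check over the whole power set.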


3.2 Combination of Belief Functions 

Let $S_1$ and $S_2$ be two distinct and independent sources providing two
different bbas $m_1$ and $m_2$ defined on the same frame of discernment $\Theta$.
These two bbas are combined by either the conjunctive rule of combination or the
disjunctive rule.

— The conjunctive rule of combination is used when the two sources are fully
reliable. This rule is defined in [?] as:

$$ m_{1 \cap 2}(A) = \sum_{B \cap C = A} m_1(B) \times m_2(C). \tag{4} $$



The conjunctive rule can be seen as an unnormalized Dempster's rule of
combination [?], which is defined by:

$$ m_{1 \oplus 2}(A) = \begin{cases} \dfrac{\sum_{B \cap C = A} m_1(B) \times m_2(C)}{1 - \sum_{B \cap C = \emptyset} m_1(B) \times m_2(C)} & \forall A \subseteq \Theta,\ A \neq \emptyset \\ 0 & \text{if } A = \emptyset \end{cases} \tag{5} $$

Dempster's rule of combination is normalized through
$1 - \sum_{B \cap C = \emptyset} m_1(B) \times m_2(C)$ and it works under the
closed world assumption, where all the possible hypotheses of the studied
problem are supposed to be enumerated in $\Theta$.

— The disjunctive rule is used when at least one of the sources is reliable
without knowing which one of them. It is defined in [?] by:

$$ m_{1 \cup 2}(A) = \sum_{B \cup C = A} m_1(B) \times m_2(C). \tag{6} $$
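
The three rules above differ only in the set operation applied to each pair of focal elements and in the final normalization. A minimal sketch follows, assuming bbas stored as dictionaries from `frozenset` to mass; the helper names are hypothetical. The two input bbas are the ones used later in the worked example of Section 4, with `t1`, `t2`, `t3` standing for the three hypotheses.

```python
from collections import defaultdict

EMPTY = frozenset()

def _combine(m1, m2, op):
    """Send the product m1(B) * m2(C) to op(B, C) for every pair of focal sets."""
    out = defaultdict(float)
    for B, vb in m1.items():
        for C, vc in m2.items():
            out[op(B, C)] += vb * vc
    return dict(out)

def conjunctive(m1, m2):
    """Eq. (4): products go to intersections; the empty set keeps the conflict."""
    return _combine(m1, m2, frozenset.intersection)

def disjunctive(m1, m2):
    """Eq. (6): products go to unions."""
    return _combine(m1, m2, frozenset.union)

def dempster(m1, m2):
    """Eq. (5): conjunctive rule renormalized by 1 - conflict."""
    joint = conjunctive(m1, m2)
    conflict = joint.pop(EMPTY, 0.0)
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in joint.items()}

THETA = frozenset({"t1", "t2", "t3"})
m1 = {frozenset({"t1"}): 0.4, frozenset({"t2", "t3"}): 0.2, THETA: 0.4}
m2 = {frozenset({"t2"}): 0.2, THETA: 0.8}

for A, v in sorted(dempster(m1, m2).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(A), round(v, 4))
```

With these inputs the conflict mass is 0.08, so Dempster's rule divides the remaining products by 0.92.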


3.3 Decision Making 


Combining information provided by the different sources leads to a global one
that has to be analyzed in order to choose the most likely hypothesis. Decision
making can be carried out in two different ways.


— Decision on singletons: The most likely solution to a given
problem is one of the hypotheses of $\Theta$. To determine this most likely
solution, one may:

• Maximize the credibility: It consists in retaining the most credible
hypothesis by giving the minimum of chances to each of the disjunctions.

• Maximize the plausibility: It consists in retaining the most plausible
hypothesis by giving the maximum of chances to each of the singletons.

• Maximize the pignistic probability: It was introduced in [?] and it is
the most commonly used decision function because it represents a compromise
between the maximum of credibility and the maximum of plausibility.
The pignistic probability consists in choosing the most probable singleton
hypothesis by sharing the mass attributed to each composite hypothesis
equally among the singleton hypotheses composing it.
It is given, for each singleton $X \in \Theta$, by:

$$ betP(X) = \sum_{A \in 2^\Theta,\, X \in A} \frac{m(A)}{|A|\,(1 - m(\emptyset))} \tag{7} $$

where the sum runs over the non-empty $A \in 2^\Theta$ containing $X$ and
$|A|$ represents the cardinality of $A$.

— Decision on unions of singletons: Few works have addressed decision making
on composite hypotheses ([?], [?], [?]). The approach proposed in [1]
helps to choose a solution of a given problem by considering all the elements
contained in $2^\Theta$. This approach weights the decision functions listed
previously by a utility function depending on the cardinality of the elements.
The decision $A \in 2^\Theta$ is given by:

$$ A = \underset{X \in 2^\Theta}{\operatorname{argmax}} \big( m_d(X)\, pl(X) \big) \tag{8} $$

where $m_d$ is a mass defined by:

$$ m_d(X) = K_d\, \lambda_X \left( \frac{1}{|X|^r} \right) \tag{9} $$

$r$ is a parameter in [0,1] for choosing a decision. When $r$ is equal to 0 it
reflects total indecision, and when it is equal to 1 it means that we decide on
a singleton. The value $\lambda_X$ is used to integrate the lack of knowledge
about one of the elements $X$ of $2^\Theta$. $K_d$ is a normalization factor.
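
The two kinds of decision can be compared on a small example. The sketch below is only illustrative: it assumes $\lambda_X = 1$ for every $X$, drops $K_d$ because a positive constant does not change the argmax, and reuses the combined bba of the worked example in Section 4 (rounded masses, labels `t1`, `t2`, `t3`). Function names are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(frame):
    """All non-empty subsets of the frame, smallest cardinality first."""
    items = sorted(frame)
    return [frozenset(c) for k in range(1, len(items) + 1)
            for c in combinations(items, k)]

def pl(m, A):
    """Plausibility: total mass of the focal sets that intersect A."""
    return sum(v for B, v in m.items() if B & A)

def bet_p(m, frame):
    """Eq. (7): each focal mass is shared equally among its singletons."""
    m_empty = m.get(frozenset(), 0.0)
    return {x: sum(v / (len(A) * (1.0 - m_empty))
                   for A, v in m.items() if x in A)
            for x in frame}

def decide_on_unions(m, frame, r, lam=lambda X: 1.0):
    """Eqs. (8)-(9) with K_d omitted: argmax of lambda_X / |X|^r * pl(X)."""
    return max(nonempty_subsets(frame),
               key=lambda X: lam(X) / len(X) ** r * pl(m, X))

THETA = frozenset({"t1", "t2", "t3"})
m = {frozenset({"t1"}): 0.3478, frozenset({"t2"}): 0.1304,
     frozenset({"t2", "t3"}): 0.1739, THETA: 0.3478}

bp = bet_p(m, THETA)
print(max(bp, key=bp.get))                        # singleton decision
print(sorted(decide_on_unions(m, THETA, r=1.0)))  # r = 1 favours singletons
print(sorted(decide_on_unions(m, THETA, r=0.5)))  # smaller r tolerates unions
```

Lowering `r` reduces the penalty on cardinality, so the retained hypothesis can move from a singleton to a union.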

4 Decision Rule Based on a Distance Measure 

In this paper, we aim to propose a decision rule that helps us choose the most
likely hypothesis for a given problem after combining the information provided
by different sources of information, i.e. bbas. This rule, based on a distance
measure, is inspired by [?] and is defined as:

$$ A = \underset{X}{\operatorname{argmin}}\; d(m, m_X) \tag{10} $$

The proposed rule calculates the distance between $m$, a combined bba (obtained
after applying a combination rule), and $m_X$, the categorical bba of $X$ such
that $X \in 2^\Theta$. The most likely hypothesis to choose is the hypothesis
whose categorical bba is the nearest to the combined bba.

In order to make a decision:

— First, we have to identify the elements for which we have to construct the
categorical bba. We choose to work on elements of $2^\Theta$ whose cardinality
is less than or equal to 2. This filtering limits the number of elements to be
considered, especially when the power set $2^\Theta$ has a large cardinality.

— Then, we construct the categorical bba for each of the selected elements.

— Finally, we calculate the distance between the combined bba and each of
the categorical bbas. The minimum distance is kept and our decision corresponds
to the element whose categorical bba has the lowest distance to the
combined bba.

For the calculation of the distance between the bbas, we use the Jousselme
distance [8], which is specific to the theory of belief functions because of the
matrix $D$ defined on $2^\Theta$. This distance has the advantage of taking into
account the cardinality of the focal elements. It is defined for two bbas $m_1$
and $m_2$ as follows:

$$ d(m_1, m_2) = \sqrt{\frac{1}{2} (m_1 - m_2)^T D\, (m_1 - m_2)} \tag{11} $$

where $D$ is a matrix based on the Jaccard index as a similarity measure between
focal elements. This matrix is defined as:

$$ D(A, B) = \begin{cases} 1 & \text{if } A = B = \emptyset \\ \dfrac{|A \cap B|}{|A \cup B|} & \forall A, B \in 2^\Theta \end{cases} \tag{12} $$
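
The distance can be computed literally from Eqs. (11) and (12) by indexing vectors and the matrix $D$ over the whole power set. The sketch below does this with plain lists; the dictionary representation of bbas, the function names and the labels `a`, `b`, `c` are assumptions made for illustration. Building the full matrix is only practical for small frames.

```python
from itertools import combinations
from math import sqrt

def powerset(frame):
    """Every subset of the frame, the empty set included, in a fixed order."""
    items = sorted(frame)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def jaccard_matrix(subsets):
    """Eq. (12): D[i][j] = |A n B| / |A u B|, and 1 when both sets are empty."""
    return [[1.0 if not (A | B) else len(A & B) / len(A | B)
             for B in subsets] for A in subsets]

def jousselme(m1, m2, frame):
    """Eq. (11): sqrt(0.5 * (m1 - m2)^T D (m1 - m2)) over the power set."""
    subsets = powerset(frame)
    D = jaccard_matrix(subsets)
    diff = [m1.get(A, 0.0) - m2.get(A, 0.0) for A in subsets]
    n = len(subsets)
    quad = sum(diff[i] * D[i][j] * diff[j] for i in range(n) for j in range(n))
    return sqrt(max(quad, 0.0) / 2.0)  # guard against tiny negative rounding

THETA = frozenset({"a", "b", "c"})
m_a = {frozenset({"a"}): 1.0}        # categorical bba on a
m_b = {frozenset({"b"}): 1.0}        # categorical bba on b
m_ab = {frozenset({"a", "b"}): 1.0}  # categorical bba on a U b

print(jousselme(m_a, m_b, THETA))   # disjoint singletons
print(jousselme(m_a, m_ab, THETA))  # a singleton and a union containing it
```

Two categorical bbas on disjoint hypotheses are at the maximal distance, while a singleton is closer to a union that contains it, which is the cardinality effect mentioned in the text.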
To illustrate the proposed decision rule, we take the following example. Let us
consider the frame of discernment $\Theta = \{\theta_1, \theta_2, \theta_3\}$.
The list of elements for which we have to construct the categorical bba is
$\{\theta_1, \theta_2, \theta_3, \theta_1 \cup \theta_2, \theta_1 \cup \theta_3, \theta_2 \cup \theta_3\}$.
Suppose that we have two sources $S_1$ and $S_2$ providing two different bbas
$m_1$ and $m_2$ defined on the frame of discernment $\Theta$. Table 1 shows
these two bbas as well as their combined bba obtained after applying Dempster's
rule of combination.


Table 1. bba1 and bba2 and their combined bba

bba1                                   bba2                     combined bba
$m_1(\theta_1) = 0.4$                  $m_2(\theta_2) = 0.2$    $m_{comb}(\theta_1) = 0.3478$
$m_1(\theta_2 \cup \theta_3) = 0.2$    $m_2(\Theta) = 0.8$      $m_{comb}(\theta_2) = 0.1304$
$m_1(\Theta) = 0.4$                                             $m_{comb}(\Theta) = 0.3478$
                                                                $m_{comb}(\theta_2 \cup \theta_3) = 0.1739$


Applying our proposed decision rule gives the results in Table 2, which shows,
for every element, the distance between the categorical bba of this element and
the combined bba.


Table 2. Results of the proposed decision rule

Element                        Distance
$\theta_1$                     0.537
$\theta_2$                     0.647
$\theta_3$                     0.741
$\theta_1 \cup \theta_2$       0.472
$\theta_1 \cup \theta_3$       0.536
$\theta_2 \cup \theta_3$       0.529

Based on the results obtained in Table 2, the most likely hypothesis to choose
is the element $\theta_1 \cup \theta_2$.
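
The worked example can be replayed with a short script. This is a sketch under stated assumptions: bbas are dictionaries from `frozenset` to mass, the quadratic form is summed only over the sets that carry mass in either bba (equivalent to the full matrix form, since all other entries of the difference vector are zero), and `t1`, `t2`, `t3` are illustrative labels for the three hypotheses. Because the inputs are the rounded masses of Table 1, the distances can differ from Table 2 in the third decimal.

```python
from itertools import combinations
from math import sqrt

def jaccard(A, B):
    """Jaccard index between two sets, taken as 1 when both are empty."""
    return 1.0 if not (A or B) else len(A & B) / len(A | B)

def jousselme(m1, m2):
    """Jousselme distance, summed over the focal sets of m1 and m2 only."""
    keys = set(m1) | set(m2)
    diff = {A: m1.get(A, 0.0) - m2.get(A, 0.0) for A in keys}
    quad = sum(diff[A] * diff[B] * jaccard(A, B) for A in keys for B in keys)
    return sqrt(max(quad, 0.0) / 2.0)

def candidates(frame, max_card=2):
    """Elements of the power set with cardinality 1..max_card."""
    items = sorted(frame)
    return [frozenset(c) for k in range(1, max_card + 1)
            for c in combinations(items, k)]

def decide(m, frame, max_card=2):
    """Return the candidate whose categorical bba is nearest to m, plus all distances."""
    dists = {X: jousselme(m, {X: 1.0}) for X in candidates(frame, max_card)}
    return min(dists, key=dists.get), dists

THETA = frozenset({"t1", "t2", "t3"})
m_comb = {frozenset({"t1"}): 0.3478, frozenset({"t2"}): 0.1304,
          frozenset({"t2", "t3"}): 0.1739, THETA: 0.3478}

best, dists = decide(m_comb, THETA)
for X, d in dists.items():
    print(sorted(X), round(d, 3))
print("decision:", sorted(best))
```

Restricting `max_card` to 2 mirrors the filtering step of the rule; raising it would also consider larger unions at the cost of more candidates.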


5 Credibilistic Decision Process for Ontology Matching 

In [6], we proposed a credibilistic decision process for ontology matching. In
the following, we describe this process, which occurs mainly in three steps, and
then apply the proposed decision rule in order to find a correspondence for a
given entity of the source ontology.
1- Matching ontologies: We apply three name-based techniques (Levenshtein
distance, Jaro distance and Hamming distance) for matching two ontologies $O_1$
and $O_2$ related to conference organization¹. We have the following results:


Table 3. Results of matching the entity ConferenceMember of $O_1$ with entities of $O_2$

method         $e_2 \in O_2$      n
Levenshtein    Conference-fees    0.687
Jaro           Conference         0.516
Hamming        Conference         0.625

This table shows that, using the Levenshtein distance, the entity
ConferenceMember matches Conference-fees with a confidence value of 0.687.
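
As an illustration of how a name-based technique produces such a confidence value, the sketch below turns the Levenshtein edit distance into a similarity in [0,1]. The normalization by the length of the longer name is an assumption made here; the paper does not state which normalization was used. Under that assumption the pair of Table 3 gives 1 - 5/16 = 0.6875, which is consistent with the reported 0.687.

```python
def levenshtein(s, t):
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,                # delete cs
                            curr[j - 1] + 1,            # insert ct
                            prev[j - 1] + (cs != ct)))  # substitute or match
        prev = curr
    return prev[-1]

def levenshtein_similarity(s, t):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - levenshtein(s, t) / longest

print(levenshtein("ConferenceMember", "Conference-fees"))
print(levenshtein_similarity("ConferenceMember", "Conference-fees"))
```

The Jaro and Hamming measures of Table 3 are not reproduced here, since their exact variants and normalizations are not specified in the text.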

2- Modeling the matching under the theory of belief functions: We are
interested here in modeling the matching results obtained in the previous step
under the theory of belief functions.

— Frame of discernment: It contains all the entities of the target ontology
$O_2$ for which a corresponding entity in the source ontology $O_1$ exists.

— Source of information: Every correspondence established by one of the
matching techniques is considered as information given by a source.

— Basic Belief Assignments (bba): Once we get all the correspondences, we
keep only those where a source entity $e_1 \in O_1$ has a correspondence
under each of the three techniques. Then, we construct the mass function of
each selected correspondence. The similarity measure obtained
after applying a matching technique is interpreted as a mass. Since, for a
source of information, the masses have to sum to 1, the remaining mass is
assigned to total ignorance $\Theta$. Let us take the
results illustrated in Table 3. In this table, we have information provided
by three different sources respectively denoted by $S^{e_1}_{lev}$,
$S^{e_1}_{jaro}$ and $S^{e_1}_{hamming}$, where $e_1$ = ConferenceMember. The
bba related to the source $S^{e_1}_{lev}$ is:
$m^{S^{e_1}_{lev}}$(Conference-fees) = 0.687 and
$m^{S^{e_1}_{lev}}(\Theta)$ = 1 - 0.687 = 0.313. The
bbas for the other sources are constructed in the same manner.

— Combination: Let us summarize the bbas obtained for the three sources. We have:

• $m^{S^{e_1}_{lev}}$(Conference-fees) = 0.687 and $m^{S^{e_1}_{lev}}(\Theta)$ = 0.313

• $m^{S^{e_1}_{jaro}}$(Conference) = 0.516 and $m^{S^{e_1}_{jaro}}(\Theta)$ = 0.484

• $m^{S^{e_1}_{hamming}}$(Conference) = 0.625 and $m^{S^{e_1}_{hamming}}(\Theta)$ = 0.375

Once we apply Dempster's rule of combination, we obtain the following
results:

• $m^{e_1}_{comb}$(Conference-fees) = 0.2849

• $m^{e_1}_{comb}$(Conference) = 0.5853

• $m^{e_1}_{comb}(\Theta)$ = 0.1298
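
The modeling and combination steps can be sketched as follows. Each similarity value becomes a simple bba with the remaining mass on the whole frame, and the three bbas are folded with Dempster's rule. The function names are hypothetical, and the frame used here is only a placeholder: the paper does not list the entities of the target ontology, so `Committee` is added merely to make the frame larger than the two matched entities. The combined masses do not depend on that choice.

```python
from functools import reduce

def simple_bba(entity, similarity, frame):
    """Mass `similarity` on the matched entity, the rest on total ignorance."""
    return {frozenset({entity}): similarity, frame: 1.0 - similarity}

def dempster(m1, m2):
    """Dempster's rule: products on intersections, renormalized by 1 - conflict."""
    joint = {}
    for B, vb in m1.items():
        for C, vc in m2.items():
            joint[B & C] = joint.get(B & C, 0.0) + vb * vc
    conflict = joint.pop(frozenset(), 0.0)
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in joint.items()}

# Placeholder frame (assumption): only the first two entities come from Table 3.
THETA = frozenset({"Conference", "Conference-fees", "Committee"})

# (matched entity, similarity) for Levenshtein, Jaro and Hamming, from Table 3.
sources = [("Conference-fees", 0.687), ("Conference", 0.516), ("Conference", 0.625)]

m_comb = reduce(dempster, (simple_bba(e, s, THETA) for e, s in sources))
for A, v in m_comb.items():
    print(sorted(A), round(v, 4))
```

Since Dempster's rule is associative, folding the sources in a different order would give the same combined bba.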

¹ http://oaei.ontologymatching.org/2013/conference/index.html
3- Decision Making: Based on the combined bba, which takes into account
all the information provided by the different sources, we are able in this
step to choose for each entity of the source ontology its corresponding entity
in the target ontology. For example, for the entity ConferenceMember, we are
able to decide whether to match it with Conference-fees or Conference, or
whether we do not have a precise decision but rather an uncertain one where we
match ConferenceMember to Conference-fees $\cup$ Conference. In our
credibilistic process, we are interested in obtaining an uncertain decision.
For this purpose, we apply our proposed decision rule. First, we construct the
categorical bba of elements having a cardinality equal to 2. For the example
illustrated in figure 1 we have:

— $m$(conference-document $\cup$ conference) = 1

— $m$(conference $\cup$ conference-fees) = 1

— $m$(conference-Volume $\cup$ committee) = 1


Then we calculate the distance between the combined bba obtained previously
and each of the categorical bbas. Our best alignment corresponds to the
element nearest to the combined bba, in other words the element whose
categorical bba has the minimum distance to the combined bba. For the entity
ConferenceMember of the ontology $O_1$ we find conference-fees $\cup$ conference
with a distance equal to 0.52. This process is repeated for each entity of the
source ontology in order to identify the most significant correspondences in the
target ontology.

6 Related Works 

Only a few ontology matching methods have considered that dealing with
uncertainty in a matching process is a crucial issue. In this section, we
present some of them, where the probability theory [?] and the Dempster-Shafer
theory ([3], [?], [?]) are the main mathematical models used. In [?],
the authors proposed an approach for matching ontologies based on Bayesian
networks, which is an extension of BayesOWL. BayesOWL consists in
translating an OWL ontology into a Bayesian network (BN) through the
application of a set of rules and procedures. In order to match two ontologies,
the source and target ontologies are first translated into $BN_1$ and $BN_2$
respectively. The mapping is processed between the two ontologies as evidential
reasoning between $BN_1$ and $BN_2$. The authors assume that the similarity
information between a concept $C_1$ from a source ontology and a concept $C_2$
from a target ontology is measured by the joint probability distribution
$P(C_1, C_2)$.

In [3], the author viewed ontology matching as a decision making process
that must be handled under uncertainty. He presented a generic framework that
uses Dempster-Shafer theory as a mathematical model for representing uncertain
mappings as well as combining the results of the different matchers. Given two
ontologies $O_1$ and $O_2$, the frame of discernment represents the Cartesian
product $e \times O_2$, where each hypothesis is the couple
$\langle e, e_i \rangle$ such that $e \in O_1$ and $e_i \in O_2$. Each matcher
is considered as an expert that returns a similarity
measure converted into a basic belief mass. The Dempster rule of combination is
used to combine the results provided by the matchers. The pairs with
plausibility and belief below a given threshold are discarded. The remaining
pairs represent the best mapping for a given entity.

Although the authors in [?] handle uncertainty in the matching process,
their proposal differs from that proposed in [5]. In fact, they use the
Dempster-Shafer theory in a specific context of question answering where
including uncertainty may yield better results. Unlike in [5], they did not
describe in depth how the frame of discernment is constructed. In addition,
uncertainty is handled only once the matching is processed. In fact, the
similarity matrix is constructed for each matcher. Based on this matrix, the
results are modeled using the theory of belief functions and then they are
combined.

In [16], the authors focused on integrating uncertainty when matching
ontologies. The proposed method modeled and combined the outputs of three
ontology matchers. For an entity $e \in O_1$, the frame of discernment $\Theta$
is composed of mappings between $e$ and all the concepts in an ontology $O_2$.
The different similarity values obtained through the application of the three
matchers are interpreted as mass values. Then, a combination of the results of
the three matchers is performed.

7 Conclusion and Perspectives 

In this paper, we proposed a decision rule based on a distance measure. This
decision rule helps to choose the most likely hypothesis for a given problem.
It is based on the calculation of the distance between a combined bba and a
categorical bba. We apply this rule in our proposed credibilistic decision process
for ontology matching. First, we match two ontologies. Then, the obtained
correspondences are modeled under the theory of belief functions. Based on
the obtained results, a decision is made by applying our proposed
decision rule.

In the future, we aim at applying other matching techniques. We are also
interested in constructing an uncertain ontology based on the results obtained
after decision making, and in conducting experiments to qualitatively assess
the relevance of our approach.